PLOS Biology
● Public Library of Science (PLoS)
Preprints posted in the last 90 days, ranked by how well they match PLOS Biology's content profile, based on 14 papers previously published here. The average preprint has a 0.04% match score for this journal, so anything above that is already an above-average fit.
Bukhari, K.; Rodriguez-Monguio, R.; Lopez-Bermudez, B.; Yamaki, J.; Beuttler, R.; Ong, J. C. L.; Brown, L. M.; Seoane-Vazquez, E.
Introduction: Clinical and population decision-making relies on the systematic evaluation of extensive regulatory evidence. FDA drug reviews provide detailed information on clinical trial design, enrollment criteria, sample size, randomization, comparators, endpoints, and indications. However, extracting these data is resource-intensive and time-consuming. Generative artificial intelligence large language models (LLMs) may accelerate the extraction and synthesis of such information. This study compares the performance of three LLMs, ChatGPT-4o, Gemini 2.5 Pro, and DeepSeek R1, in extracting and synthesizing regulatory and clinical information, using antibiotics approved for complicated urinary tract infections (cUTIs) between 2010 and 2025 as a case study. Methods: LLMs were evaluated using general (short, direct) and detailed (structured, guidance-referencing) prompts across five domains: accuracy (precision and recall), explanation quality, error type (hallucination rate, misclassification, and omission), efficiency (response time, correct answers per second, and seconds per correct answer), and consistency of responses generated in duplicate runs. Two investigators independently reviewed outputs against FDA guidance, resolving discrepancies by consensus. Statistical analyses included χ², Wilcoxon, and Kruskal-Wallis tests with false discovery rate correction. Results: Among 324 responses, accuracy differed significantly across models (χ², p<0.001), with Gemini 2.5 Pro achieving the highest accuracy (66.7%), followed by ChatGPT-4o (51.9%) and DeepSeek R1 (37.0%). General prompts outperformed detailed prompts (59.3% vs 44.4%; p=0.011). Gemini 2.5 Pro showed the highest explanation quality and most consistent outputs, while ChatGPT-4o had the shortest response times and highest efficiency. Hallucination was the most frequent error type across models.
Conclusion: LLMs showed variable capability in extracting regulatory and clinical trial information. Gemini 2.5 Pro showed the strongest overall performance, while ChatGPT-4o was faster but less accurate, and DeepSeek R1 underperformed across most domains. These findings highlight both the promise and limitations of LLMs in regulatory science and support their complementary use alongside human review to streamline evidence synthesis. Author Summary: Our research addresses a critical question in artificial intelligence for healthcare: how well do generative artificial intelligence (GenAI) tools extract and synthesize regulatory and clinical information to inform decision-making? We assessed the performance of ChatGPT-4o, Gemini 2.5 Pro, and DeepSeek R1 in extracting and synthesizing information from regulatory documents and clinical trial data using all FDA-approved antibiotics for the treatment of complicated urinary tract infections. We compared LLM outputs directly with the original data sources. We assessed the models' performance using both broad and detailed prompts across several areas, including accuracy of the information (precision and recall), quality of explanation, type of errors (hallucination, misclassification, and omission), efficiency and speed (response time, correct answers per second, and seconds per correct answer), and consistency of responses across repeated runs. The results suggest that while the models were generally fast and efficient at extracting large volumes of information, they also produced errors and omissions that could limit their reliability. These findings highlight both the promise and the current limitations of GenAI, underscoring its potential value as a human-supervised tool for safely supporting regulatory science and clinical decision-making.
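The analysis described above compares model accuracies with chi-squared tests under false discovery rate correction. As a rough illustration (not the authors' code), the sketch below runs pairwise chi-squared tests on invented correct/incorrect counts that loosely echo the reported accuracies, then applies the Benjamini-Hochberg step-up procedure; the counts and the choice of BH are assumptions.

```python
from scipy.stats import chi2_contingency

def benjamini_hochberg(pvals, alpha=0.05):
    """Benjamini-Hochberg FDR: reject all hypotheses up to the largest
    rank k with p_(k) <= k/m * alpha; returns a reject flag per p-value."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    max_k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * alpha:
            max_k = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        reject[i] = rank <= max_k
    return reject

# Hypothetical (correct, incorrect) counts per model, 108 prompts each,
# loosely mirroring the reported accuracies; for illustration only.
counts = {"Gemini 2.5 Pro": (72, 36), "ChatGPT-4o": (56, 52), "DeepSeek R1": (40, 68)}
models = list(counts)
pvals = []
for a in range(len(models)):
    for b in range(a + 1, len(models)):
        _, p, _, _ = chi2_contingency([counts[models[a]], counts[models[b]]])
        pvals.append(p)
print(benjamini_hochberg(pvals))
```

The step-up rule is what distinguishes BH from a plain per-test threshold: a borderline p-value can still be rejected if enough smaller p-values sit below their ranks' thresholds.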
Corpas, M.
Background: Despite representing approximately 85% of the global population, low- and middle-income countries (LMICs) contribute less than 10% of participants to genome-wide association studies and biobank research. This disparity has profound implications for the generalisability of precision medicine. However, no standardised framework exists to quantify research equity at the biobank level or track progress over time. Methods: We developed the Health Equity Informative Metrics (HEIM) framework to quantify alignment between biobank research output and global disease burden. We analysed 75,356 PubMed-indexed publications (2000-2025) from 27 biobanks across 19 countries, mapping each to 179 disease categories from the Global Burden of Disease Study 2021. We calculated disease-specific Gap Scores measuring the mismatch between burden (disability-adjusted life years, DALYs) and research attention, biobank-level Equity Alignment Scores (EAS), and regional equity ratios comparing high-income (HIC) to LMIC research intensity. Findings: Within our 27-biobank sample, HIC biobanks produced substantially higher research output per DALY compared to LMIC biobanks (ratio: 322:1; sensitivity analyses: >100:1 across methodological variations). Regional concentration was marked: the Americas and Europe accounted for 97.8% of publications, while Africa, Eastern Mediterranean, and South-East Asia combined contributed <1%. Of 179 disease categories, 23 (13%) exhibited critical or high-severity research gaps despite substantial global burden. Only 4 of 27 biobanks (15%) achieved Strong or Moderate equity alignment scores; 19 (70%) were rated Poor. Six disease categories showed critical gaps, including drowning (15.7 million DALYs, 0 mapped publications) and iodine deficiency (2.3 million DALYs, 10 publications). Interpretation: The HEIM framework reveals substantial disparities in how biobank research capacity is distributed relative to global disease burden.
While the precise equity ratio varies with sample selection and methodology, the fundamental pattern of profound inequity is robust across reasonable analytical choices. These findings provide baseline measurements for tracking progress toward more equitable genomic research and identify high-priority targets for intervention.
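A Gap Score of the kind described measures mismatch between a disease's burden and the research attention it receives. The abstract does not give the published HEIM formula, so the sketch below uses a simple hypothetical stand-in (burden share minus publication share); the function name, formula, and toy numbers are all assumptions for illustration.

```python
def gap_score(dalys, pubs):
    """Hypothetical Gap Score: a disease's share of total burden minus its
    share of total research output. Positive values flag under-studied
    diseases. The published HEIM definition may differ."""
    total_dalys = sum(dalys.values())
    total_pubs = sum(pubs.values())
    return {
        d: dalys[d] / total_dalys - pubs.get(d, 0) / total_pubs
        for d in dalys
    }

# Toy numbers echoing the pattern reported in the abstract
# (drowning: large burden, zero mapped publications).
dalys = {"drowning": 15.7e6, "ischaemic heart disease": 185e6}
pubs = {"drowning": 0, "ischaemic heart disease": 12000}
scores = gap_score(dalys, pubs)
print(scores)
```

Because both shares sum to one, the scores sum to zero across categories: research over-attention to one disease is necessarily under-attention to another.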
Granitto, M.; Kim, E.; Forney, C. R.; Yin, C.; Diouf, A. A.; VonHandorf, A.; Dexheimer, P. J.; Parameswaran, S.; Chen, X.; Donmez, O. A.; Rowden, H.; Swoboda, C. O.; Shook, M. S.; Dunn, K.; Kebir, H.; Velez-Colon, M.; Kaufman, K.; Ho, D.; Laurynenka, V.; Edsall, L. E.; Brennan, V.; Gewurz, B. E.; Namjou, B.; Wilson, E.; Fisher, K. S.; Zabeti, A.; Lawson, L. P.; Alvarez, J. I.; Kottyan, L. C.; Weirauch, M. T.
Background: Multiple sclerosis (MS) is an immune-mediated demyelinating disease of the central nervous system affecting 2.8 million people worldwide. Both genetic and environmental factors contribute to MS risk, with Epstein-Barr virus (EBV) infection being an important environmental factor. To better clarify the role of EBV in MS, we examined its impact on gene expression, chromatin accessibility, and transcription factor binding in primary B cells and EBV-transformed B cells derived from patients with MS and healthy controls. Results: RNA-seq and ATAC-seq analyses revealed extensive MS-dependent gene expression and chromatin accessibility differences in EBV-transformed, but not in primary B cells. These changes are largely accounted for by the expression levels of EBNA2, an EBV-encoded transcriptional regulator previously implicated in MS. ChIP-seq analysis revealed that EBNA2 binding with its interacting human partners RBPJ, EBF1, and PU.1 is highly enriched at MS genetic risk loci, with extensive EBNA2 allelic binding and increased enrichment at MS genetic risk loci in MS-derived cells. Conclusions: Our findings demonstrate that enhanced EBNA2 activity in MS alters human gene expression, chromatin accessibility, and transcription factor binding in an MS-dependent manner. Collectively, this study provides new insights into the molecular mechanisms through which EBV, particularly EBNA2, interacts with host genetic risk to contribute to MS pathogenesis.
Maupin, D.; Suchak, T.; Sengupta, A.; Marra, M.; Geifman, N.; Spick, M.
The growth of generative AI and readily available Open Access health datasets has transformed researcher productivity, leading to an explosion in publications that has in part been attributed to paper mills (organisations that provide manuscripts for payment) and other unethical actors. These entities are not, however, homogeneous, and have a range of products and target markets. While demand from China has received much attention, here we provide a case study of CDC WONDER, a dataset that has been exploited by a network of researchers reporting affiliations in Pakistan, the United States and the UK, potentially linked to medical-residency-driven demand from junior clinicians or trainees. The number of publications using CDC WONDER grew from 88 in 2021 to 1223 in 2025. Over the same period, the proportion of papers reporting at least one author from Pakistan grew from 0.5% in 2021 to 27.2% in 2025, with unusually extensive collaboration networks. In some cases these works featured over 15 co-authors, often including representation from Western institutions, yet despite this high level of resourcing they produced only straightforward analyses of well-described conditions using publicly available data. The majority of these outputs additionally show evidence of being produced from a template, with formulaic titles and identical methods, for example using the same statistical model and platform (Joinpoint regression). Identifying papers produced by fast-churn workflows is essential to protect the integrity of the scientific literature from being flooded with low-quality research. This can be achieved through more proactive desk rejection of misleading and formulaic mass-produced submissions, and through better understanding of which use cases are appropriate for different Open Science resources. With the growing capability of AI to mass-produce research, education will be essential to support critical appraisal and preserve the benefits of Open Science.
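One simple screen for the formulaic, fill-in-the-blank titles described above is pairwise string similarity. The sketch below uses Python's standard-library `difflib` with an illustrative threshold; both the threshold and the example titles are assumptions, and a real screen would also compare methods sections and metadata.

```python
from difflib import SequenceMatcher
from itertools import combinations

def flag_templated(titles, threshold=0.8):
    """Flag title pairs whose similarity ratio exceeds `threshold`:
    a crude screen for templated manuscripts. The 0.8 cutoff is an
    illustrative choice, not a validated value."""
    flagged = []
    for a, b in combinations(titles, 2):
        if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold:
            flagged.append((a, b))
    return flagged

titles = [
    "Trends in stroke mortality in the United States, 1999-2020: a CDC WONDER analysis",
    "Trends in sepsis mortality in the United States, 1999-2020: a CDC WONDER analysis",
    "Single-cell atlas of the developing human cortex",
]
print(flag_templated(titles))
```

The two trend titles differ only in the disease slot and score well above the threshold; the unrelated title does not pair with either.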
Russo, L.; Lentini, N.; Soru, L.; Pastorino, R.; Boccia, S.; Ioannidis, J.
The terms personalized, individualized and precision medicine are increasingly used to describe health interventions, yet their operational meaning in clinical research remains unclear. Despite extensive conceptual discussion, there is limited empirical evidence on how these labels are applied in randomized controlled trials (RCTs) and whether such trials meet standards of transparency and methodological rigor. We systematically examined 262 RCTs published between 2020 and 2022 that used the terms "personalized", "individualized", or "precision" in the title to describe an intervention. The term "personalized" was used most frequently (49.2%), followed by "individualized" (45.8%) and "precision" (5.0%). In most trials, personalization involved behavioral, digital, or pharmacological interventions, with few studies employing -omics approaches. Personalization was most often based on individual lifestyle factors, psychological characteristics, or disease classification. We also found that in most trials, personalization consisted of tailoring a single intervention to individuals (82.8%), often through individualized dosage (73.2%). Most included RCTs were judged to be at high risk of bias and showed limited transparency with respect to data and code sharing. Our study suggests that, in contemporary RCTs, the labels "personalized", "individualized", and "precision" are applied interchangeably to a wide range of heterogeneous interventions that are predominantly non-genomic. Greater conceptual clarity and stronger methodological standards are needed to ensure that claims of personalization in clinical research are empirically meaningful and reliable.
Siebert, M.; Caquelin, L.; Naudet, F.; Ross, J. S.; Ramachandran, R.
Background: The strength and transparency of clinical trial evidence supporting drug approvals have become increasingly scrutinized, particularly given the increased use of regulatory flexibility and expedited pathways. While U.S. Food and Drug Administration (FDA) standards have been extensively analyzed, evidence standards at the European Medicines Agency (EMA) remain less well characterized. This study therefore aims to systematically assess the design, quality, and outcomes of pivotal efficacy trials supporting EMA drug approvals between 2020 and 2023. Methods: We conducted a cross-sectional analysis of new medicines and biosimilars receiving positive opinions from the EMA's Committee for Medicinal Products for Human Use (CHMP) and subsequent approval by the European Commission between January 2020 and December 2023. Data were extracted from European Public Assessment Reports (EPARs) and EMA medicine databases. Key variables included trial design features, primary endpoint type and achievement status, and justification for approval in cases of failed efficacy endpoints. Results: Between 2020 and 2023, 232 drugs were approved by the EMA for 281 indications. Of these, 205 (88.4%) were new active substances and 65 (28.0%) were granted orphan designation. Forty-six products (19.8%) were approved via a special regulatory program, most commonly Conditional Approval (26 products; 11.2%). Cancer was the leading therapeutic area, accounting for 61 approvals (26.3%). Approvals were supported by 393 pivotal clinical trials. Of these, 327 (83.2%) were randomized controlled trials (RCTs) and 218 (66.6% of RCTs) had a superiority design. A total of 232/393 trials (59.0%) relied on surrogate endpoints. Overall, 22 approvals (9.5%) were supported by at least one pivotal trial in which at least one primary endpoint was not met; in seven of these cases (31.8%), the failed trial was the sole pivotal trial.
The most common rationale for approval despite null primary results was reliance on the totality of evidence, secondary endpoints, or clinical judgment (9 products; 40.9%). Conclusions: Our findings reveal substantial variability in the design and evidentiary strength of pivotal trials supporting EMA approvals between 2020 and 2023. While the majority of studies were RCTs, reliance on surrogate endpoints was common. That nearly 10% of approvals were based on pivotal trials with null primary endpoints highlights the nuanced role of regulatory judgment in therapeutic evaluation. These findings prompt reflection on evolving evidence standards in drug regulation and underscore the need for transparency and consistent justifications.
Curtin, M.; Wiltshire, A.; Nilsonne, G.; Siebert, M.
Objective: Antimicrobial resistance (AMR) is an urgent global health threat, resulting in more than 5 million deaths globally in 2019. Timely and complete reporting of antimicrobial agent (AMA) clinical trial results is essential to evaluate the safety and efficacy of investigational therapies. The Food and Drug Administration Amendments Act (FDAAA) of 2007 mandated results reporting for applicable clinical trials to ClinicalTrials.gov. After nearly ten years of underreporting, the US Department of Health and Human Services (HHS) issued the Final Rule, requiring a designated responsible party to submit results to ClinicalTrials.gov and clarifying applicable clinical trial (ACT) criteria. ACTs and probable ACTs (pACTs) are interventional studies regulated by the FDA with at least one site based in the United States; pACTs are those initiated prior to January 2017, when the Final Rule came into effect. This study investigates the compliance and timeliness of results reporting of ACTs and pACTs for AMAs. Design: We extracted data from ClinicalTrials.gov for trials involving AMAs with primary completion dates between May 1, 2013, and May 1, 2023. We analyzed the time from primary completion to results reporting and estimated the hazard ratio to compare timeliness between ACTs and pACTs. Additionally, we assessed delays in reporting across different study types and funding sources. Results: Our search returned 2629 NCT records. After exclusion of ineligible trials, we included 2525 trials. We found 1769 pACTs (70.1%; 95% CI, 69.3%-72.9%) and 756 ACTs (29.9%; 95% CI, 28.2%-31.8%). Among the 2525 eligible trials, 2249 (89.1%; 95% CI, 87.8%-90.2%) were reported on ClinicalTrials.gov or in journal publications. Overall, 81.3% (95% CI, 79.7%-82.3%) of trials were reported late or missing (75.0% of ACTs vs 83.6% of pACTs). ACTs were more likely to report results earlier than pACTs, with a hazard ratio of 1.4 (95% CI, 1.3-1.5).
Conclusions: ACTs demonstrated greater reporting compliance and shorter delays in the reporting of overdue results. While this analysis provides initial insights, limitations related to timeline and sample scope suggest that broader investigations are needed to fully evaluate the impact of the Final Rule.
McClean, M.; Koele, S. E.; Dreisbach, J.; Mirold-mei, S.; Njeleka, F.; Mapamba, D.; Mtafya, B.; Phillips, P. P.; De Jager, V. R.; Dawson, R.; Narunsky, K.; Diacon, A. H.; Svensson, E. M.; Heinrich, N.; Casale, F. P.; Hoelscher, M.
Culture-based monitoring of bacterial load is slow and susceptible to missing data, contributing to the length and cost of TB clinical trials. Non-culture-based alternatives, like the Tuberculosis Molecular Bacterial Load Assay (TB-MBLA), could represent a solution. Our objectives were to evaluate TB-MBLA as a biomarker in early bactericidal activity (EBA) studies and to explore whether combining biomarkers with joint modelling could provide insight into underlying biological processes and reduce data loss. We generated TB-MBLA (LifeArc®) data from sputum samples from all 78 patients in the PanACEA BTZ-043 Phase Ib/IIa trial. In addition, we defined a joint biomarker as the first principal component derived from a probabilistic principal component analysis (pPCA) integrating TB-MBLA, colony forming unit (CFU), and time-to-positivity (TTP) data. With TB-MBLA alone and the principal component 1 (PC1) marker, we re-evaluated the original stage IIa dose-response and stage Ib/IIa pharmacokinetic-pharmacodynamic (PK-PD) exposure-response analyses, applying linear and non-linear mixed models, respectively. For TB-MBLA, we could not detect an exposure-response effect in the PK-PD analysis, in contrast with CFU and TTP. When combining biomarkers, we observed a significant but less pronounced Emax exposure-response between days 0-3 compared with CFU and TTP alone. We also successfully applied pPCA as a modelling framework and show evidence that combining CFU and TTP in a joint latent component can improve detection of treatment effects compared with either biomarker alone. Study Highlights: What is the current knowledge on the topic? BTZ-043 is a first-in-class antimycobacterial compound with promising applications in drug-sensitive pulmonary tuberculosis. TB EBA trials are typically conducted using sputum culture assays for dose-exposure-response modelling, but the technical challenges of these assays limit their effectiveness.
What question did this study address? How does the novel RT-qPCR assay TB-MBLA perform as a dose-exposure-response marker for the novel antitubercular compound BTZ-043, and what are the effects of joint modelling approaches in EBA modelling tasks? What does this study add to our knowledge? We did not observe a significant exposure response for BTZ-043 in TB-MBLA over the 14-day treatment window. Joint modelling with pPCA can overcome the problems of missing and contaminated data that are typical of culture data from TB treatment monitoring cohorts. How might this change drug discovery, development, and/or therapeutics? The design of EBA trials should consider the drug-bacterial subpopulation axis of the specific study drug to maximise efficiency. Latent variable modelling techniques can be an effective and efficient framework for modelling Mycobacterium tuberculosis load in EBA trials.
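The joint biomarker described above is the first principal component of three bacterial-load readouts. The study uses probabilistic PCA (which additionally handles missing values); the sketch below uses classical PCA via SVD as a simplified stand-in, on synthetic data whose loadings and noise levels are invented for illustration (TTP is given the opposite sign, since time to culture positivity rises as load falls).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic per-sample biomarkers sharing one latent "bacterial load"
# signal; loadings and noise are assumptions, not trial estimates.
n = 78
latent = rng.normal(size=n)
cfu = 1.0 * latent + 0.3 * rng.normal(size=n)    # log CFU
ttp = -0.8 * latent + 0.4 * rng.normal(size=n)   # time-to-positivity
mbla = 0.9 * latent + 0.5 * rng.normal(size=n)   # molecular load

X = np.column_stack([cfu, ttp, mbla])
Xs = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize columns
U, S, Vt = np.linalg.svd(Xs, full_matrices=False)  # classical PCA via SVD
pc1 = Xs @ Vt[0]                                   # joint biomarker score
var_explained = S[0] ** 2 / (S ** 2).sum()
print(round(var_explained, 3))
```

When the three assays largely track one underlying quantity, PC1 captures most of the variance and correlates strongly with the shared signal, which is the rationale for using it as a single, less noisy endpoint.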
Villani, U.; D'Agate, S.; Saez Lopez, E.; Ramon-Garcia, S.; Della Pasqua, O.
Introduction: Buruli ulcer (BU) is a neglected tropical disease primarily affecting skin and sometimes bone. Standard therapy consists of rifampicin (RIF, once daily) plus clarithromycin (CLA, twice daily) over 8 weeks. Adding amoxicillin-clavulanate (AMX/CLV) may shorten treatment, but predicting treatment success before clinical trial implementation is challenging. Aims: To assess the probability of bacterial eradication following treatment with novel investigational BU regimens over different intervals, using a mechanism-based modelling and simulation approach. Methods: In vitro time-kill assays with RIF, CLA, and AMX/CLV alone and in combination were performed with a range of clinical isolates of Mycobacterium ulcerans. Bactericidal activity was characterized using a bacterial growth dynamics model, including an Emax function to describe the drug effect. Subsequently, clinical trial simulations were performed to evaluate drug disposition and skin penetration in a cohort of virtual subjects, taking into account interindividual variability in pharmacokinetics and pharmacodynamics (n=70/arm). Several regimens, including standard therapy and AMX/CLV-containing combinations with higher RIF doses, were assessed. The probability of eradication at 4-8 weeks was assessed across strains with different susceptibility, assuming varying bacterial loads at the start of treatment. Results: Beta-lactam-containing combinations showed higher potency and maximum killing rates relative to the currently recommended regimens. Consequently, regimens containing AMX/CLV with higher RIF doses (20 mg/kg q.d. or 10 mg/kg b.i.d.) outperformed standard therapy, achieving 100% eradication within 4 weeks for baseline loads up to 1,000 CFU/mL across most isolates, except one from China. At higher loads (10,000 CFU/mL), 6 weeks were required.
Conclusions: The use of mechanism-based modelling and clinical trial simulations provides a robust translational framework for the evaluation of novel therapies for neglected diseases such as BU. Irrespective of differences in bacterial susceptibility, adding AMX/CLV or using RIF-AMX/CLV dual therapy may reduce BU treatment from 8 to 4 weeks.
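The growth-dynamics-plus-Emax structure used in this kind of analysis can be sketched in a few lines. Below, a logistic growth term is opposed by a sigmoidal Emax kill term and integrated with a simple Euler scheme; every parameter value (growth rate, Emax, EC50, Hill coefficient) is an illustrative assumption, not a fitted M. ulcerans estimate.

```python
import numpy as np

def simulate_kill(conc, b0=1e3, kg=0.02, bmax=1e8,
                  kmax=0.35, ec50=1.0, hill=1.5,
                  days=28, dt=0.01):
    """Euler-integrate a logistic growth model with an Emax kill term:
        dB/dt = kg*B*(1 - B/Bmax) - kmax * C^h / (EC50^h + C^h) * B
    with time in hours (dt in hours). All parameters are illustrative."""
    steps = int(days * 24 / dt)
    effect = kmax * conc**hill / (ec50**hill + conc**hill)  # constant exposure
    b = b0
    for _ in range(steps):
        db = kg * b * (1 - b / bmax) - effect * b
        b = max(b + db * dt, 0.0)
    return b

no_drug = simulate_kill(conc=0.0)   # growth toward carrying capacity
treated = simulate_kill(conc=5.0)   # kill term dominates growth
print(no_drug, treated)
```

With the kill rate above the growth rate, the load decays essentially to zero over the 4-week window, which is the qualitative behaviour behind an "eradication within 4 weeks" prediction; probability-of-eradication analyses then layer interindividual pharmacokinetic variability on top of this core model.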
Shuaibu, I. I.; Khan, M. A.; Alkhamis, D.; Alkhamis, A.
Background: Sepsis-induced mortality is frequently driven by the systemic dissemination of pore-forming toxins (PFTs), such as Staphylococcus aureus alpha-hemolysin. Biomimetic "nanosponges", nanoparticles coated in red blood cell (RBC) membranes, have emerged as a promising detoxification strategy. However, current methods rely largely on empirical iteration, often failing to optimize the competitive binding kinetics required to outcompete native RBCs in a high-flow hemodynamic environment. Methods: We developed a deterministic ordinary differential equation (ODE) kinetic model based on the law of mass action to simulate the competitive inhibition of alpha-toxin by decoy nanoparticles. Unlike prior geometric models, this study explicitly tracked molar receptor concentrations to enforce saturation kinetics and mass conservation. We performed a multi-parametric sweep of nanoparticle radius (r_NP: 50-200 nm) and receptor surface density (d_rec: 200-10,000 sites/µm²) to identify the design window that maximizes toxin sequestration efficiency within a clinically relevant timeframe (60 minutes). Results: Baseline simulations established a native RBC receptor concentration of 3.34 × 10⁻⁷ M. The optimization landscape revealed a non-linear dependence on receptor density rather than particle size. The optimal design window was identified at a receptor density of >8,000 sites/µm² on an 80 nm vector, achieving a theoretical toxin neutralization efficiency of 91.79%. Notably, complete (100%) neutralization was not observed even under optimized conditions, suggesting a theoretical upper bound imposed by physiological competition. In contrast, standard biomimetic formulations (low-density, 100 nm) achieved suboptimal capture, failing to prevent significant toxin-RBC interaction. Conclusion: We demonstrate that "decoy" efficacy is governed primarily by receptor surface density rather than geometric surface area.
Our model suggests that current manufacturing protocols, which prioritize particle stability over receptor enrichment, may be kinetically insufficient for human application. These findings provide a rational design framework for next-generation nanotoxoid therapeutics.
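The mass-action competition at the heart of this model can be sketched directly: free toxin is consumed by RBC receptors and decoy receptors in parallel, and the neutralization efficiency is the decoys' share of captured toxin. For simplicity the sketch below treats binding as irreversible with a single shared on-rate; the rate constant, toxin concentration, and decoy concentrations are illustrative assumptions (only the 3.34 × 10⁻⁷ M RBC receptor figure comes from the abstract).

```python
def competitive_capture(np_sites_m, rbc_sites_m=3.34e-7,
                        toxin_m=1.0e-8, kon=1.0e5,
                        minutes=60, dt=0.01):
    """Euler-integrate mass-action competition for free toxin T between
    RBC receptors R and nanoparticle decoy receptors N (irreversible):
        dT/dt = -kon*T*R - kon*T*N
    Returns the decoys' fraction of all captured toxin. kon, T0, and the
    decoy concentrations are assumptions for illustration."""
    t_free, r_free, n_free = toxin_m, rbc_sites_m, np_sites_m
    rbc_bound = np_bound = 0.0
    for _ in range(int(minutes * 60 / dt)):   # dt in seconds
        v_rbc = kon * t_free * r_free
        v_np = kon * t_free * n_free
        t_free -= (v_rbc + v_np) * dt
        r_free -= v_rbc * dt
        n_free -= v_np * dt
        rbc_bound += v_rbc * dt
        np_bound += v_np * dt
    return np_bound / (np_bound + rbc_bound)

low = competitive_capture(np_sites_m=3.34e-7)   # decoys merely match RBC sites
high = competitive_capture(np_sites_m=3.0e-6)   # ~9x excess of decoy sites
print(round(low, 2), round(high, 2))
```

With matched concentrations the toxin splits evenly, while a roughly ninefold excess of decoy sites diverts about 90% of it, mirroring the paper's finding that capture efficiency is driven by receptor density (effective decoy receptor concentration) and saturates below 100%.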
Card, A. J.; Vital, D.; Nebeker, C.
Digital health technologies are powerful, enhancing data collection, participant engagement, and personalized health interventions, yet their rapid proliferation has outpaced guidance for research participant protection. Current practice assists researchers in identifying risks but provides limited support for comprehensive risk management. To address this gap, we developed the Digital Health Checklist-Risk Management (DHC-RM) Tool, which integrates the established Digital Health Checklist with approaches from safety risk management. We conducted a study (n=40) comparing the DHC-RM Tool with current practice using a randomized experimental difference-in-differences design. Primary outcomes were the quantity, variety, and novelty of risks identified; secondary outcomes were the same constructs applied to risk control development. Compared with current practice, use of the DHC-RM Tool dramatically improved performance across all primary outcomes. Users identified on average 14.7 additional risks (compared to baseline) versus 0.26 in the control group, and a higher number of risks in each of six pre-identified risk domains. Half of all distinct risks identified in the comparison phase were identified exclusively using the tool. The tool also improved risk control design, producing 9.63 additional risk control strategies per participant compared with 0.15 for current practice and yielding substantially greater novelty and variety. User feedback was also positive: 75% of participants reported they would use the tool again, citing its structured workflow, just-in-time examples, improved insight into risks, and its value for IRB communication. Suggestions for refinement focused primarily on expanding training examples and providing additional support for risk control development. The DHC-RM Tool significantly improves risk management practice in digital health research.
By embedding structured, ethics-informed risk management into digital health research design, the DHC-RM Tool has the potential to improve participant protection while also streamlining ethics approval. Author Summary: Digital health research can put participants (and others) at risk in ways that don't always occur to the researchers who are designing a study. Researchers also face challenges in prioritizing risks and coming up with ideas to reduce those risks. We developed a new approach, the Digital Health Checklist-Risk Management Tool (DHC-RM Tool), to give researchers the support they need to identify, assess, and address research participant risks in this fast-moving field. Our experimental study found that use of the DHC-RM Tool led to a very large improvement in how well researchers managed the risks of digital health research studies. Using the toolkit, they were able to identify more risks than they identified using current practice, including risks they would not otherwise have considered. They were also able to come up with more changes to reduce the risks associated with digital health research studies, including changes they would not otherwise have considered. Those who used the toolkit found it beneficial and easy to use. The DHC-RM Tool fills an important gap in the science and practice of participant protection in digital health research.
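The difference-in-differences estimand used in the study design above has a very small computational core: the change in the treated group minus the change in the control group, which nets out shared baseline trends. The sketch below illustrates it on invented per-participant risk counts chosen to echo the reported ~14.7 vs ~0.26 effect; the numbers are not study data.

```python
def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Difference-in-differences: (treated change) - (control change),
    removing trends common to both groups."""
    mean = lambda xs: sum(xs) / len(xs)
    return (mean(treat_post) - mean(treat_pre)) - (mean(ctrl_post) - mean(ctrl_pre))

# Hypothetical risk counts per participant, before and after the phase
# in which the tool (or current practice) was used.
treat_pre, treat_post = [6, 8, 7], [21, 23, 22]
ctrl_pre, ctrl_post = [7, 6, 8], [7, 7, 8]
print(did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post))
```

The subtraction of the control group's change is what distinguishes this from a simple pre/post comparison: any practice effect from doing the task twice is removed.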
Fidalgo, A.; Lemos, F.; Nunes, J. B.; Teixeira, C.; Nogueira, C.; Osorio, H.; Moniz, S. B.; Silva, C. M.; Lemos, C.; Castanheira, P.; Vieira, M.; Madureira, P.
Background: While antimicrobial resistance is an increasingly urgent problem, with Escherichia coli infections representing a major priority, the development of effective vaccines has proven both challenging and largely unsuccessful. Given accumulating evidence supporting the immunosuppressive role of extracellular bacterial glyceraldehyde-3-phosphate dehydrogenase (GAPDH), we aimed to characterize individual susceptibility to E. coli infections based on the presence of naturally induced antibodies against this protein. Methods: We conducted an observational case-control study including 62 individuals with E. coli bacteraemia (cases) and 124 age- and sex-matched controls without infection. Detection of GAPDH was performed in plasma samples, and plasma interleukin (IL)-10 levels and anti-GAPDH IgG (titers and concentrations) were quantified. Associations between anti-GAPDH IgG levels and infection were evaluated using logistic regression, and individual disease risk was estimated with a multivariate model incorporating IL-10 detection and low levels of anti-GAPDH IgG. Results: E. coli GAPDH was detected in the plasma of cases from which purified colonies of E. coli were isolated. IL-10 levels were significantly higher (p < 0.0001) and anti-GAPDH IgG levels significantly lower (p < 0.0001) in cases compared with controls. Logistic regression analysis revealed a strong inverse association between anti-GAPDH IgG and the diagnosis of E. coli bacteraemia (adjusted OR = 0.18, 95% CI 0.08-0.37), and a protective threshold of 1.0 µg/mL was estimated, below which 95.2% of cases were classified. In a multivariate logistic regression model, detection of IL-10 was strongly associated with infection risk (OR = 629, 95% CI 139-5355), while low anti-GAPDH IgG showed a trend towards increased risk (OR = 4.26, 95% CI 0.91-30.6). This model allowed the estimation of individual probabilities and absolute risk of E. coli bacteraemia using a combined biomarker based on levels of anti-GAPDH IgG and IL-10. Conclusions: This study provides the first evidence in humans of the protective potential of circulating anti-GAPDH antibodies, which supports GAPDH as a promising target for an alternative vaccination strategy to prevent E. coli infections.
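An inverse logistic-regression association of the kind reported (OR well below 1 per unit of standardized antibody level) can be illustrated on synthetic data. The sketch below generates a cohort with an invented protective effect, fits a two-parameter logistic regression by plain gradient ascent, and recovers the odds ratio as exp(slope); all coefficients and data are assumptions, not the study's values.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic cohort: higher (standardized) anti-GAPDH IgG lowers the odds
# of bacteraemia. True slope -1.2 is an invented illustrative effect.
n = 600
igg = rng.normal(0.0, 1.0, n)
logit = -0.5 - 1.2 * igg
y = rng.random(n) < 1 / (1 + np.exp(-logit))

# Plain gradient-ascent fit of logistic regression (intercept + slope).
X = np.column_stack([np.ones(n), igg])
beta = np.zeros(2)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ beta))
    beta += 0.01 * X.T @ (y - p) / n   # gradient of the log-likelihood

odds_ratio = np.exp(beta[1])   # OR per 1-SD increase in anti-GAPDH IgG
print(round(odds_ratio, 2))
```

An odds ratio below 1 on the fitted slope is exactly the "protective association" framing used in the abstract; in practice one would use a vetted fitting routine and report confidence intervals alongside the point estimate.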
Choi, J.; Lee, K.; Chavalarias, D.; Shin, J. I.; Ioannidis, J.
Importance: Over several decades there have been extensive debates on the use and misuse of statistical significance. It is important to capture what P-values are reported in biomedical papers and whether their patterns have changed over time. Objective: To quantify the reporting dynamics of P-values in biomedical articles in the PubMed and PubMed Central (PMC) databases over a 35-year period (1990-2025). Design: Data were retrieved from the National Library of Medicine via PubMed and PubMed Central (PMC, full-text articles included), fetching the entire accessible corpus. Records were computationally processed using a regular expression algorithm, validated for various mathematical formats, to extract reported P-values from the text. Setting: The study includes 22,734,796 PubMed abstracts, 6,031,459 PMC abstracts and 6,397,787 PMC full-texts. Main Outcomes and Measures: Proportion of articles reporting at least 1 P-value, at least 1 P-value below thresholds (.05 and .005), and distribution of P-values by magnitude and operator type. Results: The proportion of articles reporting P-values increased from 7.5% in 1990 to 18.3% in 2025 for PubMed abstracts, and from 5.2% to 53.3% for PMC full-texts. The median number of P-values per article increased from 2 to 7 in PMC full-text articles, with upward trends observed in all databases. A high proportion of P-values remains clustered around .05 and .001 in all databases. The proportion of articles reporting at least one P-value ≤ .05 has remained in the range 94%-98% since 1998, while the proportion reporting at least one P-value ≤ .005 has increased over time, reaching 57.0% for PubMed abstracts and 62.5% for PMC full-texts.
The reporting of exact P-values increased until 2015, but with no further increase in the last 10 years (PubMed abstracts: 17.6% in 1990, 51.1% in 2015, 49.8% in 2025). Conclusions and Relevance: Our evaluation demonstrates the pervasive entrenchment of P-values, despite heavy debates and major changes in the content of the biomedical literature over time. More P-values are reported, and papers using P-values almost always report some that are statistically significant. Readers should remain aware of the major issues surrounding P-value misuse and misinterpretation. Key Points: Question: With continuing debate regarding the use and misuse of statistical significance, how have reported P-values evolved over the past 35 years? Findings: Across over 22 million PubMed abstracts and over 6 million full-texts, reporting of P-values became more common over time. Almost all (94-98%) abstracts and full-texts reporting P-values have at least one significant at the .05 threshold. The reporting of exact P-values increased until 2015 but has plateaued since then. Clustering around traditional statistical significance thresholds remains consistent. Meaning: P-value reporting has become more common over time, with pervasive prevalence of significant P-values across the biomedical literature.
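The study above describes extracting P-values from article text with a validated regular expression. The actual pattern is not given in the abstract; the sketch below is a hypothetical, simplified illustration of the idea (the pattern, function name, and sample sentence are all invented for the example).

```python
import re

# Hypothetical, simplified pattern: an operator (=, <, >, ≤, ≥) after "P"/"p",
# followed by a decimal or scientific-notation number. The study's validated
# regex handles many more mathematical formats than this.
P_PATTERN = re.compile(r"\b[Pp]\s*([=<>≤≥])\s*(\d?\.\d+|\d+(?:\.\d+)?(?:[eE]-?\d+)?)")

def extract_p_values(text):
    """Return (operator, value) pairs for P-values reported in text."""
    return [(m.group(1), float(m.group(2))) for m in P_PATTERN.finditer(text)]

sample = "Mortality fell in the treated arm (P = .03); groups were balanced (p < 0.001)."
print(extract_p_values(sample))  # [('=', 0.03), ('<', 0.001)]
```

Capturing the operator separately is what allows the paper's distinction between exact values ("=") and threshold reports ("<", "≤") to be tabulated.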
Takeuchi, F.; Dona, M. S. I.; Ho, W. W. H.; Lambert, S. A.; Inouye, M.; Kato, N.
Background: Drug suitability is determined by safety, efficacy, and pathological appropriateness. The pharmacogenomics of drug suitability can be assessed by analyzing drug response and drug choice in large population cohorts. Methods: We investigated drug response and drug choice for dyslipidemia and hypertension using genetic, phenotypic, and prescribing data from the UK Biobank and the All of Us Research Program. Drug response was reassessed with rigorous biomarker scaling, while genome-wide association studies (GWAS) and polygenic scores were used to examine genetic factors influencing drug choice. Results: Conventional analyses showed that variants influencing baseline LDL cholesterol (LDL-C) were inversely associated with absolute LDL-C change but concordant with relative change following statin therapy; these signals disappeared after applying a variance-stabilizing Box-Cox transformation, indicating a methodological artifact in biomarker scaling. GWAS for drug choice identified several significant loci and unique genetic correlation patterns with cardiometabolic traits. Polygenic scores for drug choice yielded statistically significant predictive performance, which was enhanced by incorporating demographic factors, though prediction strength in clinical settings remains modest. Conclusion: Variance-stabilizing transformation corrects spurious pharmacogenetic associations introduced by biomarker scaling. Genetic variation informs drug choice for dyslipidemia and hypertension, but current polygenic scores provide only modest benefits in clinical application.
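The Box-Cox transformation central to the abstract above has a standard closed form: (x^λ − 1)/λ for λ ≠ 0, and log(x) for λ = 0. A minimal sketch, with invented LDL-C values purely for illustration (the study's actual λ estimation and scaling procedure are not described in the abstract):

```python
import math

def box_cox(x, lam):
    """Box-Cox transform: (x**lam - 1)/lam for lam != 0, log(x) for lam == 0."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam

# Invented LDL-C values (mmol/L), for illustration only
baseline = [5.2, 4.1, 6.0]
on_statin = [3.0, 2.6, 3.4]

# Change on the transformed scale. With lam = 0 (log scale), an absolute
# difference of transformed values corresponds to a relative (fold) change,
# which is how a variance-stabilizing rescale can dissolve artifacts that
# appear only on the raw absolute-change scale.
deltas = [box_cox(b, 0) - box_cox(t, 0) for b, t in zip(baseline, on_statin)]
```

In practice λ is usually estimated by maximum likelihood (e.g. `scipy.stats.boxcox` does this); the fixed λ here just shows the mechanics.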
Yuan, Y.; Peng, Z.; Doi, S. A. R.; Furuya-Kanamori, L.; Cao, H.; Lin, L.; Chu, H.; Loke, Y.; Mol, B. W.; Golder, S.; Vohra, S.; Xu, C.
Background: The number of problematic randomized clinical trials (RCTs) has risen sharply in recent decades, posing serious challenges to the integrity of the healthcare evidence ecosystem. Objective: To investigate whether retraction of problematic RCTs could reduce evidence contamination. Design: Retrospective cohort study. Setting: A secondary analysis of the VITALITY Study database. Participants: 1,330 retracted RCTs with 847 systematic reviews. Measurements: The difference in the median number (and interquartile range, IQR) of contamination events before and after retraction, and the association between time-to-retraction and likelihood of evidence contamination. Results: Among these retracted RCTs, 426 led to evidence contamination, resulting in 1,106 contamination events (251 after retraction vs. 855 before retraction). The time interval between RCT publication and first contamination ranged from 0.2 to 30.9 years, with a median of 3.3 years (95% CI: 3.0 to 3.9). The median number of contaminated systematic reviews was lower after retraction than before retraction (0, IQR: 0 to 1 vs. 1, IQR: 1 to 2, P < 0.01). Compared with trials retracted more than 7.5 years after publication, those retracted between 1.0 and 1.8 years (OR = 0.70, 95% CI: 0.60 to 0.80) and those retracted within 1.0 year (OR = 0.69, 95% CI: 0.60 to 0.80) were associated with a lower likelihood of evidence contamination. Limitations: Only assessed contaminated systematic reviews with quantitative synthesis, and limited to retracted RCTs. Conclusions: Retracting problematic RCTs can significantly reduce evidence contamination, and faster retraction was associated with less contamination. To safeguard the integrity of the evidence ecosystem, academic journals should act promptly in retracting problematic studies to minimize their downstream impact. Primary Funding Sources: The National Natural Science Foundation of China (72204003, 72574229).
Raveney, B. J.; Okamoto, T.; Kimura, A.; Lin, Y.; Araki, M.; Kimura, Y.; Sato, N.; Shimizu, Y.; Nishida, Y.; Yokota, T.; Maikusa, N.; Taketsuna, M.; Okada, Y.; Ishizuka, T.; Nakamura, H.; Miyake, S.; Takahashi, Y.; Sato, W.; Yamamura, T.
Multiple sclerosis (MS) therapies primarily rely on lymphocyte depletion or trafficking blockade, carrying risks of systemic immunosuppression; moreover, such treatments have limited efficacy in secondary progressive multiple sclerosis (SPMS). Thus, drugs that target stage-specific inflammation without broad immunosuppression are an unmet clinical need. Invariant natural killer T (iNKT) cells, a regulatory lymphocyte population that is numerically and functionally impaired in MS, are a potential therapeutic target; the glycolipid OCH is a selective iNKT cell stimulator that skews the cytokine environment towards Th2. In this double-blind, placebo-controlled phase II trial, 30 patients with relapsing MS received weekly oral OCH or placebo for 24 weeks. In the pre-specified SPMS subgroup (n=12), OCH achieved complete relapse prevention (p=0.0003), prolonged relapse-free survival (p=0.0079), and no new lesions (0/6), with no evidence of disease activity (NEDA-3) in 5/6 patients. In comparison, in the placebo group, 5/6 patients suffered relapses, 2/6 developed new lesions, and no placebo-treated SPMS patient achieved NEDA-3. OCH treatment increased IL-4-producing Th cells in patient peripheral blood while decreasing pathogenic GM-CSF-producing Th cells. Parallel studies in mouse models of MS (EAE) corroborated this mechanism and further revealed that OCH activated gut iNKT cells. Disease amelioration by OCH depended on IL-4, and its efficacy was further enhanced by depletion of B cells. These data reveal gut-brain axis mediation of progressive-stage pathology distinct from relapsing-remitting MS. Findings from this bidirectional translational study uncover mechanistic differences between SPMS and other types of MS and highlight divergent roles for B cells and Th cells.
Furthermore, OCH exerts its therapeutic benefit via mechanisms distinct from currently available drugs, exploiting iNKT cell regulatory potential to reprogram pathogenic T helper responses without lymphocyte depletion. The unique yet effective nature of OCH treatment positions it as an attractive future oral therapy for SPMS. One-Sentence Summary: The iNKT cell-activating ligand OCH suppresses disease activity selectively in secondary progressive MS in a phase II clinical trial, revealing stage-specific IL-4-mediated immune cell interactions in MS pathology.
Ebrahimi, A.; Wiil, U. K.; Olsson, T.; Kockum, I. S.; Lio, P.; Manouchehrinia, A.; Kiani, N. A.
Background: The prodromal phase of multiple sclerosis (MS) is increasingly recognized, but most studies have focused on isolated symptoms or static comorbidity counts, leaving the evolving structure of pre-onset disease burden underexplored. Objective: To characterize dynamic disease trajectories preceding MS onset through longitudinal network modeling. Methods: Health data from 10,273 MS patients and 47,167 matched controls in Sweden were analyzed. Disease co-occurrence networks were constructed for three pre-onset windows (0-5, 5-10, 10-15 years), with comparisons of centrality, clustering, and path length. Rewiring scores captured structural shifts, while Markov clustering and trajectory mapping identified comorbidity communities. Results: MS networks were denser, more clustered, and showed shorter path lengths than controls, reflecting higher systemic interconnectivity. Psychiatric and metabolic diagnoses, especially depression, anxiety, diabetes, and abdominal pain, were hubs that gained prominence over time. Distinct clusters, including neuropsychiatric-toxicological and immune-endocrine constellations, were observed only in MS. Rewiring analysis revealed significant topological shifts in key diagnoses, such as inflammatory CNS disorders and substance use, as onset approached. Conclusions: MS is preceded by dynamic reorganization of the comorbidity landscape, marked by increasing connectivity and rewired hubs. This framework highlights systemic disruption before diagnosis and provides a novel, network-based tool for studying prodromes in complex disorders.
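The network metrics compared above (clustering, path length) have simple definitions on an unweighted co-occurrence graph. A minimal stdlib sketch, using an invented toy network of diagnoses (the study's actual networks, edge criteria, and software are not described in the abstract):

```python
from collections import deque

def clustering(adj, v):
    """Local clustering coefficient: fraction of neighbour pairs that are themselves linked."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k) if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

def shortest_path_len(adj, src, dst):
    """Unweighted shortest path length via breadth-first search; None if unreachable."""
    seen, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return seen[u]
        for w in adj[u]:
            if w not in seen:
                seen[w] = seen[u] + 1
                queue.append(w)
    return None

# Invented toy co-occurrence network: an edge means two diagnoses co-occur above chance
adj = {
    "depression": {"anxiety", "abdominal pain", "diabetes"},
    "anxiety": {"depression", "abdominal pain"},
    "abdominal pain": {"depression", "anxiety"},
    "diabetes": {"depression"},
}
```

Denser MS networks score higher on the first metric and lower on the second, which is the "higher systemic interconnectivity" the authors report; libraries such as NetworkX provide the same metrics at scale.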
Tolladay, J.; Yau, C.
Background: Climate change is increasingly recognised as a threat to population health and healthcare systems, yet the effects of environmental variability on pharmaceutical prescribing remain poorly characterised in the UK. Using a wide array of open-source datasets, we examine the effect of environmental, geographic and socioeconomic factors on prescribing habits in England. Methods: We linked monthly, practice-level prescribing data for England (2010-2025) to meteorological, air-quality, flooding and demographic datasets using spatial nearest-neighbour matching. Prescribing volumes for cardiovascular, respiratory and antibiotic medications were analysed using log-transformed outcomes in mixed-effects models with practice-level random effects, adjusting for region, seasonality, deprivation and temporal trends, using both continuous environmental measures and extreme-condition indicators. A complementary Bayesian hierarchical model jointly estimated the conditional effects of multiple correlated environmental exposures, with partial pooling across practices and support for distributed lag effects. Results: In mixed-effects analyses, temperature showed the most consistent associations with prescribing, with higher temperatures linked to increased respiratory and cardiovascular prescriptions and reduced antibiotic use, while rainfall, flooding and most pollutants had small or negligible effects. Environmental predictors exhibited strong correlations, motivating multivariate modelling. Bayesian multivariate models confirmed temperature as the dominant environmental driver after adjustment for correlated exposures, with substantially larger variation attributable to regional and socioeconomic factors than to environmental conditions. Conclusions: Temperature is the most consistent environmental determinant of GP prescribing in England, with higher temperatures associated with increased cardiovascular and respiratory prescribing and reduced antibiotic use.
Rainfall, flooding and most air pollutants show little evidence of meaningful effects once seasonal and meteorological structure is accounted for. Environmental associations are modest in magnitude relative to persistent socioeconomic and regional drivers of prescribing, indicating that climate-related influences operate within broader structural determinants of healthcare utilisation. These results suggest that, at monthly timescales, prescribing demand is relatively insensitive to environmental variability, supporting a focus on long-term adaptation and surveillance rather than short-term demand shocks in climate-resilient healthcare planning.
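The core modelling idea above (a log-transformed prescribing outcome regressed on temperature) can be sketched in miniature. This is not the authors' mixed-effects or Bayesian model; it is a hypothetical simple least-squares fit on invented monthly figures, showing only how a slope on a log outcome becomes a proportional change per degree:

```python
import math

def ols_fit(x, y):
    """Slope and intercept of a simple least-squares fit of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

# Invented monthly figures: mean temperature (deg C) vs antibiotic items dispensed
temp = [4.0, 7.0, 11.0, 15.0, 18.0]
items = [1250, 1180, 1100, 1040, 990]

slope, intercept = ols_fit(temp, [math.log(v) for v in items])
# On a log outcome, exp(slope) - 1 is the proportional change per +1 deg C,
# matching the direction reported above (warmer months, fewer antibiotics)
pct_per_degree = (math.exp(slope) - 1.0) * 100.0
```

The real analysis adds practice-level random effects and seasonal, regional and deprivation adjustments (e.g. via `lme4` or `statsmodels` mixed models), but the interpretation of the temperature coefficient is the same.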
Motley, M. P.; Hobbs, M. M.; Waltmann, A.; Macintyre, A. N.; Duncan, J. A.
The host response to Neisseria gonorrhoeae is variable, and understanding its systemic and local components is critical to understanding anti-gonococcal immunity for vaccine development. We used a controlled human infection model of male gonococcal urethritis in naive volunteers, in combination with multiplex cytokine analyte analysis of blood and urine specimens taken before infection, at the time of acute symptoms, and after curative treatment, to study responses to early N. gonorrhoeae infection. (This study utilized data and specimens from all 11 participants assigned to control arms of two previous randomized clinical trials.) All 11 participants developed urethritis between 2 and 5 days post inoculation with N. gonorrhoeae strain FA1090, with a majority having visible discharge by day 3. In urine, we found increases in IL-1RA, G-CSF, and the chemokines CXCL10, CCL4, CCL11, GROα/β/γ, and IL-8/CXCL8, with IL-1RA and CCL4 showing direct correlation with the degree of pyuria at the time of infection. Contrary to a prior study using the human challenge model and N. gonorrhoeae strain MS11mkC, we did not see similar increases in urine IL-6, TNF-α, or IL-1β, although differences in IL-6 and TNF-α were observed in participants with later development of infection. Additionally, plasma cytokine levels were unchanged in this cohort over the course of their infection, suggesting these infections were confined to the urethra. We propose that differences in strain virulence or in the threshold used to define a clinical case may be responsible for this discrepancy, meriting further study and continued use of non-invasive inflammatory markers to study local effects in addition to systemic effects of gonococcal infection. Author Summary: Gonorrhea, caused by the bacterium Neisseria gonorrhoeae, remains a global public health concern, yet repeated infections are common and no vaccine is available.
A key challenge for vaccine development is limited understanding of how the human immune system responds during early infection, when bacteria are confined to the urethra, vagina, or other mucosal sites. To address this gap, we studied immune responses in a controlled human infection model in which male volunteers with no prior exposure were experimentally inoculated with N. gonorrhoeae in the urethra. Immune signaling molecules were measured in urine and blood samples collected before infection, during symptoms, and after antibiotic treatment. All participants developed urethral inflammation within a few days of infection. We observed marked increases in multiple inflammatory cytokines in urine, some of which correlated with the number of neutrophils in the urine. In contrast, immune markers in the bloodstream remained largely unchanged. These findings suggest that early infection with the N. gonorrhoeae strain tested triggers a strong localized immune response without widespread systemic inflammation. Our results highlight the value of urine-based, non-invasive sampling and demonstrate the power of human challenge models for studying early immune responses that have been difficult to characterize in animal systems.
Korbmacher, M.; Myhr, K.-M.; Wergeland, S.; Wesnes, K.; Torkildsen, O.
Objective: To replicate and extend recent findings suggesting that higher serum alpha-linolenic acid (ALA) levels are associated with reduced disease activity and progression in multiple sclerosis (MS). Methods: We reanalysed clinical trial data from 85 people with MS who had serum ALA, magnetic resonance imaging (MRI), and clinical (EDSS, PASAT) assessments collected over two years, with additional follow-up at 12 years. Linear and mixed models were used to assess the relationship between ALA and clinical and MRI outcomes. Mediation analyses tested whether ALA mediated associations between brain volume or T2 lesion load and disability. Results: ALA measures were consistent over time (κ = 0.83). Higher ALA predicted lower EDSS (β = -0.41, 95% CI [-0.73, -0.08]) and larger brain volume (β = 0.22, 95% CI [0.09, 0.36]). ALA was a non-significant mediator of brain volume or lesion effects on EDSS and did not predict long-term clinical or cognitive change. Discussion: We replicate prior associations between higher serum ALA levels and reduced disability in MS and extend these by showing a beneficial association of serum ALA with brain volume. However, ALA did not predict long-term progression, limiting its prognostic value.